Computer and Modernization (计算机与现代化) ›› 2011, Vol. 1 ›› Issue (11): 55-4. doi: 10.3969/j.issn.1006-2475.2011.11.015

• Chinese Information Technology •

Research on the Application and Improvement of Conditional Random Field Models

姜文志1,顾佼佼1,胡文萱2,栗 飞3   

  1. 海军航空工程学院兵器科学与技术系,山东烟台264001; 2. 海军航空工程学院外训系,山东烟台264001; 3. 海军航空工程学院指挥系,山东烟台264001
  • Received: 1900-01-01 Revised: 1900-01-01 Publication date: 2011-11-28 Release date: 2011-11-28

Research on Applications of Conditional Random Fields and Its Improvement

JIANG Wen-zhi1, GU Jiao-jiao1, HU Wen-xuan2, LI Fei3   

  1. Department of Ordnance Science and Technology, Naval Aeronautical and Astronautical University, Yantai 264001, China; 2. Department of Foreign Training, Naval Aeronautical and Astronautical University, Yantai 264001, China; 3. Department of Command, Naval Aeronautical and Astronautical University, Yantai 264001, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-11-28 Published: 2011-11-28

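Abstract (translated from the Chinese):

Research on conditional probability models has advanced considerably in recent years. For sequence labeling problems, conditional models have gradually begun to replace generative models, and they are applied widely, for example to image recognition, natural language processing, and intrusion detection. Conditional Random Fields (CRFs) are a representative conditional model and among the most actively studied at present. They avoid the shortcomings of generative models and overcome the label bias problem of the earlier maximum entropy models, which is why they have come into wide use. In applied studies with CRFs, however, it was found that using the CRF model alone did not give the best achievable results, so improvements were made in each application. This paper mainly studies military document word segmentation, military named entity recognition, and intrusion detection; in each case the improvements, built on top of the applied model, further raised system performance.

Keywords (translated from the Chinese): conditional random fields, sequence labeling, Chinese word segmentation, named entity, intrusion detection, cascaded model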

Abstract:

Conditional probability models have developed considerably in recent years. Conditional models have gradually taken the place of generative models in sequence labeling problems, and they cover a wide range of applications, such as image recognition, natural language processing, and intrusion detection. Conditional Random Fields (CRFs) are a representative conditional model and have become one of the most widely studied, because they not only overcome the shortcomings of generative models but also avoid the label bias problem of the Maximum Entropy Model. When CRFs are used in specific applications, however, the CRF model alone may not achieve the best results, so in every application some improvements are made in addition to the CRF model itself. The research covers military command segmentation, military named entity recognition, intrusion detection, and other areas. All of these improvements build on the CRF model, and system performance is greatly improved.

Key words: conditional random fields, sequence labeling, Chinese word segmentation, named entity, intrusion detection, layered framework
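
As an illustration of how Chinese word segmentation can be cast as a sequence labeling problem, the following minimal sketch converts a per-character tag sequence back into words. The B/M/E/S tag set (begin, middle, end, single) and the function name `tags_to_words` are assumptions made for this example; the abstract does not state which tag set or implementation the authors used, and in practice the tags would come from a trained sequence labeler such as a CRF rather than being written by hand.

```python
def tags_to_words(chars, tags):
    """Group characters into words according to per-character B/M/E/S tags.

    B = first character of a multi-character word, M = middle character,
    E = last character, S = single-character word.
    (Hypothetical tag set chosen for illustration only.)
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    words = []
    current = ""  # characters of the word currently being built
    for ch, tag in zip(chars, tags):
        if tag == "S":
            # Flush any unfinished word, then emit the single character.
            if current:
                words.append(current)
                current = ""
            words.append(ch)
        elif tag == "B":
            # A new word starts; flush any unfinished word first.
            if current:
                words.append(current)
            current = ch
        elif tag == "M":
            current += ch
        elif tag == "E":
            words.append(current + ch)
            current = ""
        else:
            raise ValueError(f"unknown tag: {tag!r}")

    # Keep a trailing unfinished word rather than silently dropping it.
    if current:
        words.append(current)
    return words


# Hand-written tags standing in for a labeler's output.
example_chars = list("条件随机场模型")
example_tags = ["B", "M", "M", "M", "E", "B", "E"]
print(tags_to_words(example_chars, example_tags))
```

With this framing, segmentation quality depends entirely on how well the labeler predicts each character's tag, which is the step a CRF would handle.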